Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Fix RBO introducing data races #89710

Merged
merged 1 commit into from
Aug 1, 2023
Merged

Conversation

jakobbotsch
Copy link
Member

@jakobbotsch jakobbotsch commented Jul 31, 2023

RBO cannot safely duplicate a GTF_GLOB_REF tree without additional checks because this may introduce data races. To be able to duplicate such a tree we need to verify that the local defined is truly dead after the transformation.

Fix #78713

RBO cannot safely duplicate a GTF_GLOB_REF tree without additional
checks because this may introduce data races. To be able to duplicate
such a tree we need to verify that the local defined it is truly dead
after the transformation.

Fix dotnet#78713
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 31, 2023
@ghost ghost assigned jakobbotsch Jul 31, 2023
@ghost
Copy link

ghost commented Jul 31, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

RBO cannot safely duplicate a GTF_GLOB_REF tree without additional checks because this may introduce data races. To be able to duplicate such a tree we need to verify that the local defined it is truly dead after the transformation.

Fix #78713

Author: jakobbotsch
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@jakobbotsch jakobbotsch marked this pull request as ready for review July 31, 2023 13:24
@jakobbotsch
Copy link
Member Author

cc @dotnet/jit-contrib PTAL @AndyAyersMS

Diffs. As you can see from the diffs CSE generally comes along and removes the data race that RBO introduces in these cases, so the problem currently is more theoretical than not. But overall the diffs of the fix are quite small and removes this cross-phase dependency, so I think we can take the change. Let me know if you disagree, I don't have strong feelings about this.

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this.

Wonder how hard it would be to clean this kind of thing up later on?

+       cmp      dword ptr [rcx+08H], 0
+       setne    cl
        ; gcrRegs -[rcx]
+       movzx    rcx, cl
        test     ecx, ecx
        je       SHORT G_M7346_IG06

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Aug 1, 2023

Wonder how hard it would be to clean this kind of thing up later on?

Sadly not that easy, probably. For example, one case looks like:

------------ BB01 [000..00C) -> BB05 (cond), preds={} succs={BB02,BB05}

***** BB01
STMT00000 ( 0x000[E-] ... 0x00A )
N007 (  9,  6) [000005] DA-XG------STORE_LCL_VAR int    V04 tmp1         d:2 $244
N006 (  9,  6) [000004] ---XG------                         └──▌  NE        int    <l:$283, c:$282>
N004 (  4,  4) [000002] ---XG------                            ├──▌  IND       ref    <l:$242, c:$243>
N003 (  2,  2) [000096] -------N---                            │  └──▌  ADD       byref  $180
N001 (  1,  1) [000000] -----------                            │     ├──▌  LCL_VAR   ref    V00 this         u:1 $80
N002 (  1,  1) [000095] -----------                            │     └──▌  CNS_INT   long   40 Fseq[<unknown field>] $140
N005 (  1,  1) [000003] -----------                            └──▌  CNS_INT   ref    null $VN.Null

***** BB01
STMT00002 ( ??? ... 0x00A )
N002 (  5,  4) [000011] DA---------STORE_LCL_VAR int    V05 tmp2         d:2 $VN.Void
N001 (  1,  1) [000006] -----------                         └──▌  LCL_VAR   int    V04 tmp1         u:2 <l:$280, c:$281>

***** BB01
STMT00001 ( ??? ... ??? )
N004 (  5,  5) [000010] -----------JTRUE     void   $VN.Void
N003 (  3,  3) [000009] J------N---                         └──▌  NE        int    <l:$280, c:$281>
N001 (  1,  1) [000007] -----------                            ├──▌  LCL_VAR   int    V05 tmp2         u:2 (last use) <l:$280, c:$281>
N002 (  1,  1) [000008] -----------                            └──▌  CNS_INT   int    0 $40

V05 is used in other blocks here initially, but through RBO those blocks are removed. So the later optimizations would kind of need to repeat what RBO does -- realize that both V04 and V05 are dead once we substitute the expression. But we have no optimization pass that runs with more up-to-date liveness after this point.

What could be doable is to do a "mini" CSE here inside of RBO by allowing the duplication here after realizing that it's fine if we then manually CSE [000002] after. But seems a bit messy.

@jakobbotsch jakobbotsch merged commit 728e32b into dotnet:main Aug 1, 2023
125 checks passed
@jakobbotsch jakobbotsch deleted the fix-78713 branch August 1, 2023 09:26
@ghost ghost locked as resolved and limited conversation to collaborators Aug 31, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JIT: RBO still introduces data races
2 participants